{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "6b0186a4",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/tools/eval_query_engine_tool.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b50c4af8-fec3-4396-860a-1322089d76cb",
   "metadata": {},
   "source": [
    "# Evaluation Query Engine Tool\n",
    "\n",
    "In this section we will show you how you can use an `EvalQueryEngineTool` with an agent. Some reasons you may want to use a `EvalQueryEngineTool`:\n",
    "1. Use specific kind of evaluation for a tool, and not just the agent's reasoning\n",
    "2. Use a different LLM for evaluating tool responses than the agent LLM\n",
    "\n",
    "An `EvalQueryEngineTool` is built on top of the `QueryEngineTool`. Along with wrapping an existing [query engine](https://docs.llamaindex.ai/en/stable/module_guides/deploying/query_engine/root.html), it also must be given an existing [evaluator](https://docs.llamaindex.ai/en/stable/examples/evaluation/answer_and_context_relevancy.html) to evaluate the responses of that query engine.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db402a8b-90d6-4e1d-8df6-347c54624f26",
   "metadata": {},
   "source": [
    "## Install Dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd31aff7",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-embeddings-huggingface\n",
    "%pip install llama-index-llms-openai\n",
    "%pip install llama-index-agents-openai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9f9fcf29",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7603dec1",
   "metadata": {},
   "source": [
    "## Initialize and Set LLM and Local Embedding Model\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "05fd9050",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.settings import Settings\n",
    "from llama_index.embeddings.huggingface import HuggingFaceEmbedding\n",
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "Settings.embed_model = HuggingFaceEmbedding(\n",
    "    model_name=\"BAAI/bge-small-en-v1.5\"\n",
    ")\n",
    "Settings.llm = OpenAI()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6c6bdb82",
   "metadata": {},
   "source": [
    "## Download and Index Data\n",
    "This is something we are donig for the sake of this demo. In production environments, data stores and indexes should already exist and not be created on the fly."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "64df0568",
   "metadata": {},
   "source": [
    "### Create Storage Contexts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "91618236-54d3-4783-86b7-7b7554efeed1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import (\n",
    "    StorageContext,\n",
    "    load_index_from_storage,\n",
    ")\n",
    "\n",
    "try:\n",
    "    storage_context = StorageContext.from_defaults(\n",
    "        persist_dir=\"./storage/lyft\",\n",
    "    )\n",
    "    lyft_index = load_index_from_storage(storage_context)\n",
    "\n",
    "    storage_context = StorageContext.from_defaults(\n",
    "        persist_dir=\"./storage/uber\"\n",
    "    )\n",
    "    uber_index = load_index_from_storage(storage_context)\n",
    "\n",
    "    index_loaded = True\n",
    "except:\n",
    "    index_loaded = False"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "6a79cbc9",
   "metadata": {},
   "source": [
    "Download Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "36d80144",
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p 'data/10k/'\n",
    "!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'\n",
    "!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f801ac5",
   "metadata": {},
   "source": [
    "### Load Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3d0bb8c-16c8-4946-a9d8-59528cf3952a",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import SimpleDirectoryReader, VectorStoreIndex\n",
    "\n",
    "if not index_loaded:\n",
    "    # load data\n",
    "    lyft_docs = SimpleDirectoryReader(\n",
    "        input_files=[\"./data/10k/lyft_2021.pdf\"]\n",
    "    ).load_data()\n",
    "    uber_docs = SimpleDirectoryReader(\n",
    "        input_files=[\"./data/10k/uber_2021.pdf\"]\n",
    "    ).load_data()\n",
    "\n",
    "    # build index\n",
    "    lyft_index = VectorStoreIndex.from_documents(lyft_docs)\n",
    "    uber_index = VectorStoreIndex.from_documents(uber_docs)\n",
    "\n",
    "    # persist index\n",
    "    lyft_index.storage_context.persist(persist_dir=\"./storage/lyft\")\n",
    "    uber_index.storage_context.persist(persist_dir=\"./storage/uber\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ccb89178",
   "metadata": {},
   "source": [
    "## Create Query Engines"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "31892898-a2dc-43c8-812a-3442feb2108d",
   "metadata": {},
   "outputs": [],
   "source": [
    "lyft_engine = lyft_index.as_query_engine(similarity_top_k=5)\n",
    "uber_engine = uber_index.as_query_engine(similarity_top_k=5)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "880c2007",
   "metadata": {},
   "source": [
    "## Create Evaluator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "911235b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.evaluation import RelevancyEvaluator\n",
    "\n",
    "evaluator = RelevancyEvaluator()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "60c542c1",
   "metadata": {},
   "source": [
    "## Create Query Engine Tools"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f9f3158a-7647-4442-8de1-4db80723b4d2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.tools import ToolMetadata\n",
    "from llama_index.core.tools.eval_query_engine import EvalQueryEngineTool\n",
    "\n",
    "query_engine_tools = [\n",
    "    EvalQueryEngineTool(\n",
    "        evaluator=evaluator,\n",
    "        query_engine=lyft_engine,\n",
    "        metadata=ToolMetadata(\n",
    "            name=\"lyft\",\n",
    "            description=(\n",
    "                \"Provides information about Lyft's financials for year 2021. \"\n",
    "                \"Use a detailed plain text question as input to the tool.\"\n",
    "            ),\n",
    "        ),\n",
    "    ),\n",
    "    EvalQueryEngineTool(\n",
    "        evaluator=evaluator,\n",
    "        query_engine=uber_engine,\n",
    "        metadata=ToolMetadata(\n",
    "            name=\"uber\",\n",
    "            description=(\n",
    "                \"Provides information about Uber's financials for year 2021. \"\n",
    "                \"Use a detailed plain text question as input to the tool.\"\n",
    "            ),\n",
    "        ),\n",
    "    ),\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "275c01b1-8dce-4216-9203-1e961b7fc313",
   "metadata": {},
   "source": [
    "## Setup OpenAI Agent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ded93297-fee8-4329-bf37-cf77e87621ae",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.agent.workflow import FunctionAgent\n",
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "agent = FunctionAgent(tools=query_engine_tools, llm=OpenAI(model=\"gpt-4.1\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "48eec4e4",
   "metadata": {},
   "source": [
    "## Query Engine Passes Evaluation\n",
    "\n",
    "Here we are asking a question about Lyft's financials. This is what we should expect to happen:\n",
    "1. The agent will use the `lyftk` tool first\n",
    "2. The `EvalQueryEngineTool` will evaluate the response of the query engine using its evaluator\n",
    "3. The output of the query engine will pass evaluation because it contains Lyft's financials"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7b114dd1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: What was Lyft's revenue growth in 2021?\n",
      "=== Calling Function ===\n",
      "Calling function: lyft with args: {\"input\": \"What was Lyft's revenue growth in 2021?\"}\n",
      "Got output: Lyft's revenue growth in 2021 was $3,208,323, which increased compared to the revenue in 2020 and 2019.\n",
      "========================\n",
      "\n",
      "=== Calling Function ===\n",
      "Calling function: uber with args: {\"input\": \"What was Lyft's revenue growth in 2021?\"}\n",
      "Got output: Could not use tool uber because it failed evaluation.\n",
      "Reason: NO\n",
      "========================\n",
      "\n",
      "Lyft's revenue grew by $3,208,323 in 2021, which increased compared to the revenue in 2020 and 2019.\n"
     ]
    }
   ],
   "source": [
    "response = await agent.run(\"What was Lyft's revenue growth in 2021?\")\n",
    "print(str(response))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
